Conversation

@gogurtenjoyer
Contributor

Summary

Windows treats file handles differently than macOS and Linux do, and Model Manager v3 has exposed some issues here, so here are a few patches to solve them. I wrote more, but then the power went out and I'm not rewriting all of that; I did want to make sure to thank @skunkworxdark for catching and fixing a serious memory issue.

Related Issues / Discussions

Closes:
#8644
#8636
#8628
#8627

QA Instructions

To test, import various models from Starter Models and elsewhere: GGUF files, Safetensors files (particularly large ones), and Diffusers multi-file models. This fix targets Windows specifically, but the changes should benefit every OS and not break anything on macOS or Linux. Tested on macOS and Windows so far.

Merge Plan

Changes the GGUF loader and import_state_dict in model_on_disk.py; shouldn't conflict with anything, as these are mostly small changes in out-of-the-way code.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable) (n/a)
  • ❗Changes to a redux slice have a corresponding migration (not sure what this is)
  • Documentation added / updated (if applicable) (n/a)
  • Updated What's New copy (if doing a release after this PR)

Wrap gguf.GGUFReader and use a context manager to load memory-mapped GGUF files, so they close automatically when no longer needed. This should prevent the 'file in use by another process' errors on Windows.
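The idea can be illustrated with the stdlib `mmap` module (the actual PR wraps `gguf.GGUFReader`, whose internals differ; the `mapped_file` helper below is a hypothetical sketch, not the PR's code). On Windows, an open memory map keeps the file locked, so deterministic closing via a context manager is what prevents the 'file in use by another process' errors:

```python
import mmap
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def mapped_file(path):
    """Memory-map a file read-only and guarantee that both the map and
    the file handle are closed on exit. Relying on garbage collection
    instead would leave the file locked on Windows for an unpredictable
    amount of time."""
    f = open(path, "rb")
    try:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            yield mm
        finally:
            mm.close()
    finally:
        f.close()


# Demo: after the with-block exits, the file can be deleted even on Windows.
with tempfile.NamedTemporaryFile(delete=False) as tf:
    tf.write(b"GGUF dummy payload")
    path = tf.name

with mapped_file(path) as mm:
    header = bytes(mm[:4])

os.remove(path)  # succeeds because the map and handle were closed
print(header)  # b'GGUF'
```

The same shape applies to a reader wrapper: construct the reader on `__enter__`, release its underlying map on `__exit__`, and never hand out a reader whose lifetime isn't scoped.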
Add a check for the cached state_dict, since path is now optional; this should stop the model manager from missing the cache and triggering the resulting memory errors.
@github-actions github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels Nov 3, 2025
@gogurtenjoyer
Contributor Author

gogurtenjoyer commented Nov 3, 2025

I'm sorry but I can't seem to appease ruff here, even when configuring mine with this repo's pyproject.toml.
Edit: fixed 👍

@gogurtenjoyer gogurtenjoyer changed the title Fix installing models on Windows Fix memory issues when installing models on Windows Nov 4, 2025
@lstein
Collaborator

lstein commented Nov 8, 2025

@gogurtenjoyer a few questions that are more related to Model Manager v3 than to your PR, but I encountered them while testing your PR.

  1. One of my test cases is installing a diffusers-style FLUX LoRA. It installs and is recognized as a diffusers LoRA, but when I try to use it I get an "Is a directory" error. Is this type of LoRA known not to work? It happens on both the main branch and your PR, so it isn't a PR issue, but I wonder if it is a regression that was introduced recently. I thought we did support diffusers LoRAs.
  2. Are these FLUX LoRA .safetensors models supposed to work? https://huggingface.co/XLabs-AI/flux-lora-collection/tree/main . The model probe can't seem to identify them.
  3. MMv3 is not identifying FLUX quantized .gguf models either, for example: https://huggingface.co/city96/AuraFlow-v0.3-gguf/tree/main
  4. And these quantized .gguf models are correctly recognized and installed, but cause a core dump when executed: https://huggingface.co/wikeeyang/SRPO-Refine-Quantized-v1.0/tree/main

Are these known issues with FLUX models?

Collaborator

@lstein left a comment


There is at least one .gguf file that works with main and causes a core dump when used with this PR. I tested using flux1-schnell-Q2_K.gguf.

I believe this is related to the WrappedGGUFReader.

@gogurtenjoyer
Contributor Author

@lstein - thanks, I'll try that flux schnell model to see what's going on.
For the other questions:

  • I've actually never used a diffusers style lora and have only ever used single file loras, so I'm not sure there.
  • Invoke hasn't ever supported AuraFlow, so no change there.
  • Same as above; this isn't technically a Flux model and Invoke doesn't support it.
  • Never heard of this model before - is this a similar situation to AuraFlow?

@gogurtenjoyer
Contributor Author

Okay, the Flux Schnell Q2 linked above installed and ran correctly for me on Windows. Is the core dump during generation, or during the install?

Here are a few more GGUFs to test (which we used when trying to triage the Windows issue); the GGUF sizes/versions are in the buttons along the top:

https://civitai.com/models/630820?modelVersionId=944736
https://civitai.com/models/920261?modelVersionId=1030326

No longer attempting to delete internal object.
@gogurtenjoyer
Contributor Author

I suspect the issue is with using _mmap in loaders.py, so I'm trying an alternative there. I was bad for attempting this, and thought nestling it within a try/except would make it okay :)

@lstein could you try with this latest fix? @JPPhoto tested as well on Linux (WSL2 I believe!) and it seems to do the trick.

